11,388 results • Page 2 of 228
I'm wondering about the most straightforward way to extract the interval information contained in a FASTA header such as the one below, thanks! Also maybe to pipe it into a newly created BED file. >Mouse|chr12:112380949-112381824
updated 6.3 years ago • rbronste
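A minimal Python sketch for the question above, assuming every header follows the `>label|chrom:start-end` layout of the example. The names `HEADER_RE`, `header_to_bed` and `fasta_to_bed` are hypothetical. The coordinates are copied unchanged, so any shift between 1-based and 0-based BED positions is left to the reader.

```python
import re

# Assumed header layout, taken from the example: ">label|chrom:start-end"
HEADER_RE = re.compile(
    r"^>(?P<name>[^|]*)\|(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)\s*$"
)

def header_to_bed(header):
    """Return a tab-separated BED line (chrom, start, end, name), or None if the header does not match."""
    m = HEADER_RE.match(header)
    if m is None:
        return None
    return "\t".join([m.group("chrom"), m.group("start"), m.group("end"), m.group("name")])

def fasta_to_bed(lines):
    """Yield one BED line for every header line that matches the assumed layout."""
    for line in lines:
        if line.startswith(">"):
            bed = header_to_bed(line.rstrip("\n"))
            if bed is not None:
                yield bed
```

Passing an open file handle as `lines` and writing each yielded string plus a newline to a second file would produce the BED file.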
I am sure that someone will do this work faster and better than me. I would like to edit multiple FASTA headers from this format: >M01380:50:000000000-AV1DH:1:1101:16094:3001 1:N:0:M636:16S_V1V3 TTCTGCCT|0|TAGACCTA|0 CS1_534R_YM3_for...3|27| to this one: >M636 As you can see, "M636" is embedded in the larger header. Thank you for always helping everybody! D
updated 6.9 years ago • DVR
I want to extract **gene name** , **gene start position** and **gene stop position** from the fasta header of the fasta file. I have tried to extract based on the position but those locations are not consistent. Is there...and 17th element from this list. It works for this particular example. This does not work for other headers where these positions are different. Usually, gene name is consisten…
updated 3.9 years ago • lokraj2003
Hello! I have a FASTA file and I need a script that reads in the file, changes all the headers to a new format and writes out all the sequences in a new output file. The modified headers should contain, for each sequence, the species name (with "_" r…
updated 6.4 years ago • mpbiology.dna
Hi all, I would like to rename the headers of my FASTA file with a list of IDs. For each sequence I have this type of header: >range=chr16:946803-947997 In a separate...txt file I have a list of IDs, in the same order as the sequences, that I would like to use as headers. I guess a simple approach based on bash/awk/sed should work, but I couldn't manage to do so. Cheers
updated 16 months ago • Bertrand
How to take a specific column in sequence header identifiers of fasta file? I am having my header such as: ``` >PGM0100236.1 [Candida] scaffold00238 >PGM0100236.1 [Candida...scaffold00241 ``` I would like to take my third column alone i.e scaffold00238 for all the headers in my fasta file. Please give a simple command solution. I am new to bioinfo and linux script. Thank you
updated 19 months ago • palani
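One way to sketch the column extraction asked about above in Python, assuming the header fields are separated by whitespace as in the example. The function name `keep_field` and its `index` parameter are hypothetical. Headers with too few fields are returned unchanged.

```python
def keep_field(line, index=2):
    """Reduce a FASTA header to one whitespace-separated field (0-based index).

    Sequence lines, and headers with too few fields, are returned unchanged.
    """
    if not line.startswith(">"):
        return line
    fields = line[1:].split()
    if len(fields) <= index:
        return line
    return ">" + fields[index]
```

Applied to each line (with the trailing newline stripped), `keep_field(">PGM0100236.1 [Candida] scaffold00238")` gives `>scaffold00238`.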
Dear all, I want to add a special character "/1" to each FASTA header (at the end of the header) in an 8.5 GB FASTA file. I used the following command: perl -p -e 's/^(>.*)$/$1-New_Header_info/g' input.fasta
updated 9.3 years ago • vahapel
Hi Everyone, I have a .fasta file with functional and GO annotations. I also have an associated GFF3 file with the locations of these genes in the...genome. The IDs from the GFF3 and fasta match. I need to append the annotation information from the fasta header to the notes column of the appropriate lines...in the GFF3 file. something like this: Fasta Headers: >evm.model.Scaffold_…
updated 3.0 years ago • EJB
Hi, my FASTA headers look like: >1570-13.segment.flu1_PB2 >1570-13.segment.flu2_PB1 >1570-13.segment.flu3_PA etc. Filenames...look like: 201301234.fasta I want to have FASTA headers that look like: >201301234_PB2 >201301234_PB1 >201301234_PA I have seen this answer: https://www.biostars.org
updated 5.0 years ago • SaltedPork
In a typical FASTA file, how can the header be used as its filename (i.e., replace the current file name with header ID) ? I have multiple such FASTA
updated 6.3 years ago • cerulean
I have a FASTA file that has more than 2.7 million headers. I want to break it into chunks. >gene1 ACTG... >gene2 ATTT... ... >gene2,700,000...grep -n "^>" my.fasta > headersofmy.fasta This gives me the positional information of the headers. 1:>gene1 4:>gene2 11:>gene3 ... n:&…
updated 5.7 years ago • sicat.paolo20
The headers of my FASTA file look like this: ``` >M02529:151:000000000-AJBNG:1:1101:20806:3573:133 TGGGGAATTGTTCGCAATGGGCGCAAGCCTGACGACGCAACGCC...The "133" is the sample name, and I need it at the beginning of the header followed by a dot, like this: ``` >:133.M02529:151:000000000-AJBNG:1:1101:20806:3573 TGGGGAATTGTTCGCAATGGGCGCAAGCCTGACGACGCAACGCC...I would be glad to get a 's…
updated 24 months ago • fibar
Hi! So I have a FASTA file containing sequences, I want to replace old FASTA headers with new ones, and the first step to do so is to match with...the header names. It's the name I want to match with, so the part after the '>'. How do I do this? All sequences have headers somewhat like this...>Halobacterium_salinarum This is the part of the code where I find the headers: while (my $l…
updated 5.5 years ago • Mimmi Ahlmén
Hi everyone, I've been trying to edit the headers of my FASTA file, which I intend to upload to NCBI TSA. I can't seem to successfully upload my file to TSA and, if I'm not mistaken...it could be because of the header format. The headers of my FASTA file are as below: >TRINITY_DN1078649_c1_g1_i1 len=235 path=[0:0-234] >TRINITY_DN1078643_c0_g1_i1
updated 7 months ago • sumitra.20
Hey guys, I have tons of protein multi-fasta files and I would like to append the name of the file to the fasta-headers. For example, for an input file one.txt with the...headers >1 ATGC... >2 ATGCAT... I would like to have the output >one_1 ATGC... >one_2 ATGCAT... I use bbrename for DNA sequences, but
updated 4.1 years ago • genomes_and_MGEs
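A short Python sketch of the renaming described above, assuming the prefix should be the file name without its extension, joined to the old header with an underscore (as in `one.txt` giving `>one_1`). The function name `prefix_headers` is hypothetical. It only rewrites lines that start with `>`, so it works the same for protein and nucleotide files.

```python
from pathlib import Path

def prefix_headers(lines, filename):
    """Yield FASTA lines with the file stem prepended to every header.

    Assumption: the desired prefix is the file name minus its extension,
    joined to the original header with "_".
    """
    stem = Path(filename).stem
    for line in lines:
        if line.startswith(">"):
            yield ">" + stem + "_" + line[1:]
        else:
            yield line
```

Looping over the input files and calling `prefix_headers(handle, path)` on each open handle would cover the multi-file case.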
Hi All, I want to remove empty FASTA headers from the FASTA file. I used the commands from [this biostar post][1] but they seem to work only for nt sequence files
updated 2.6 years ago • GP
Hi, I have very little experience with scripts. I want to change my FASTA sequence headers (I have 100's of FASTA sequences per file) from very long headers to headers with the sample name (CM1) and
updated 6.4 years ago • mollysil
Hi everyone, I have a multi-fasta file named multi.fasta with the following structure: >A 124 B ATCGTA... >C 567 D GTCAG... My goal is to create a new file, with...the new fasta-headers containing only the first and last column. If I use awk -F" " '/>/ {print $1,$(NF)}' multi-fasta > modified_multi-fasta This...will print the fasta headers with …
updated 22 months ago • genomes_and_MGEs
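The awk command quoted above only acts on lines containing `>`, so sequence lines never reach the output. A hedged Python sketch that rewrites headers and passes sequence lines through unchanged could look like this. It assumes whitespace-separated header fields, and the function name `first_and_last` is hypothetical.

```python
def first_and_last(line):
    """Keep only the first and last whitespace-separated field of a header.

    Sequence lines and single-field headers are returned unchanged.
    """
    if not line.startswith(">"):
        return line
    fields = line.split()
    if len(fields) < 2:
        return line
    return fields[0] + " " + fields[-1]
```

With newlines stripped, `first_and_last(">A 124 B")` gives `>A B`, while `ATCGTA` comes back as is.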
I have a concatenated FASTA file for a series of GenBank entries with different headers. I need to edit the FASTA headers to all say "BCH" in place of...the header up to and including the space after "Archilochus alexandri". For example, DQ432746.1 Archilochus alexandri voucher
updated 4.6 years ago • selplat21
there must be a solution somewhere to my issue, but until now I could not find it. I have a list of fasta headers that I want to use to select a subset of genes from a fasta file that was created using the RAST annotation pipeline...The headers look like this: ``` 160798.5.peg.2 160798.5.peg.12 160798.5.peg.123 160798.5.peg.1234 ``` My problem is that if I use...to do this with sed or awk, but…
updated 2.4 years ago • thhaverk
I have more than 5000 FASTA sequences in a file and want to add a word, for instance phosphate, to the header of every sequence. Please tell me a Perl solution.
updated 9.0 years ago • Palu
Hi all, I have a FASTA file from which I want to extract just the sequence headers. Is there any Perl code or something similar to do that? Thanks a lot
updated 12.4 years ago • Mohammad Reza Bakhtiarizadeh
Hi everybody, I have two FASTA files with two kinds of header format. I want to replace sequences of interest in one file based on header name (I have two...list of headers for two fasta files as txt format). Could you please advise me what I should do? For example header format for file 1 and
updated 17 months ago • seta
taxonomic information assigned for each one of these MAGs but for downstream analysis I need the fasta headers to contain the taxonomic information that GTDB-tk assigned. This is how the fasta headers of one of the MAGs look...if there is a way to extract the full taxonomy from the following table and give it to the respective fasta headers of a MAG: ![sample_table][1] So this is the desired ou…
updated 22 months ago • v.berriosfarias
In a multifasta file the FASTA headers have full details as follows: ">ENSMUSG0000005892|ENSMUST00000004524351|xclkvsldjldjkfklasdfjalsjk
updated 10.8 years ago • Abdul Rawoof
I would like to filter my FASTA file using a regexp on the header. For example, keep only sequences where size != 0 >A1;size=43 ACGTATATATATATATATAT >A1;size
updated 7.8 years ago • sacha
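A minimal Python sketch of the filter described above, assuming the size is written as `;size=<integer>` in the header as in the example. The names `SIZE_RE` and `filter_nonzero_size` are hypothetical. Records whose header carries no size annotation are dropped here, which is a design choice rather than something the question specifies.

```python
import re

# Assumed annotation format, taken from the example header ">A1;size=43"
SIZE_RE = re.compile(r";size=(\d+)")

def filter_nonzero_size(lines):
    """Yield the lines of every record whose header has a non-zero size.

    Records without a ";size=" annotation are dropped (an assumption).
    """
    keep = False
    for line in lines:
        if line.startswith(">"):
            m = SIZE_RE.search(line)
            keep = m is not None and int(m.group(1)) != 0
        if keep:
            yield line
```

Because the decision is made once per header and reused for the lines that follow, multi-line sequences are kept or dropped together with their header.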
Aloha! I have a fasta file that looks like the following: >FFSA34B_100_M7_ID10014 ATCTAACAATGTTGCTCATGCAGGCCCTGCAGTAGATTTAACCATTCTATCCCTTCACCTAGCAGGTGTATCCTCCTTAATAGGAGCCATCAATTTTACAACTACTATTGCTAACAGACGTTTAGAAGGTATACCTACAGAAAAAATACCCTTATTTATT...43 And I would like to append the second column of the text file to the matching fasta header to produce the following output: &…
updated 5.5 years ago • timmers
I need to reformat headers in a fasta file with headers such as: >Agaricus_chiangmaiensis|JF514531|SH174817.07FU|reps|k__Fungi;p__Basidiomycota
updated 6.4 years ago • jack1120
Hi everyone; this is my first question on the forum. How can I compare if two fasta files contain the same sequence headers? Does any BioPython module exist for doing this? Thanks in advance, peixe
updated 12.9 years ago • Peixe
seq_file = sys.argv[1] labels = seq_file.split(".") # converting the file from fastq to fasta SeqIO.convert(seq_file,"fastq",labels[0]+".fasta","fasta") # taking the converted file and then changing the fasta header handle...used seq_record.description = "" # this strips the old header out SeqIO.write(seq_record, handle,"fasta") handle.close(…
updated 8.2 years ago • skbrimer
Hi, I would like to parse a fasta file and get all headers and seqs that match some strings (so called pattern below). What happens is that all the headers...import re # file with FASTA sequence infile = "seq.fa" # File looks…
updated 6.7 years ago • David
Hey everyone, I have a multi-fasta file like this: >NC_000914 464618..534825 gtgccttccattttggagcgggaccaaatcgcagcggttctggtaagtgcgagcagggac gtgccttccattttggagcgggaccaaatcgcagcggttctggtaagtgcgagcagggac...I would like to remove w…
updated 2.7 years ago • genomes_and_MGEs
Hi, I am trying to remove the last 5 characters from my FASTA headers in my sequencing data. I have ≈400,000 sequences and have tried to use the sed command in the terminal to do this for me. Input...>1-4 TAGGGAGA How can I use the sed command to remove the last 5 characters from my FASTA headers
updated 4.1 years ago • angela1
files: - The first is a tab file. In its first column I have a list of locations, and a description of the FASTA sequence in the 2nd column. - The second is a multi fasta file. Some sequences begin with a normal header and others with...the location in it. I'd like to compare the two files and replace the "LOC" header in the multi fasta with the location and the corresponding description in the tab fi…
updated 7.0 years ago • Amy
Hi, I have a FASTA file like **GCA_001609185.1_ASM160918v1_genomic.fsa** and I want to change the header of this FASTA file like this **>GCA_001609185.1_ASM160918v1_genomic
updated 7.8 years ago • akhilvbioinfo
Hey guys, I have a multi-fasta file containing several extracted regions, such as >NZ_KI973281.1_1234..56789 atattgagctaaaaaaatcagttttccca...32476 tgcagaagtaagggggtaacaccatgcct... ... I would like to include the strain name in the FASTA header, such as >Enterobacter_sp._MGH_6_NZ_KI973281.1_1234..56789 atattgagctaaaaaaatcagttttccca... >Enterobacter_horm…
updated 5.2 years ago • genomes_and_MGEs
Please help: does anyone know how to extract multiple headers from a FASTA file using Perl?
updated 7.6 years ago • fongsiongshawn
Here is an example of the header from the FASTA file >PSR83604 cdna supercontig:Red5_PS1_1.69.0:ps1sf1427:11608:20559:-1 gene:CEY00_Acc33586 gene_biotype...to keep just the 'description' part which contains the protein name and remove the rest from the header. I tried using 'sed', but I'm going wrong somewhere. Can someone help? Thank you in advance
updated 3.9 years ago • Vignesh
I would like to change the FASTA header. For example, I have the following sequences ``` >NR_130660.1 Hanseniaspora uvarum CBS 314 ITS region; from TYPE material
updated 14 months ago • fastamasterfromnow
Hi I have around 85 gene sequences in individual fasta files. I'd like to rename each file with their header name containing the gene name in [gene=]. For each header, I only want what...is in-between the brackets. I'm trying to do this through linux commands. in fasta file input ``` >lcl|NC_018552.1_cds_YP_006666009.1_1 [gene=rps12] [locus_tag=C329_pgp044] [db_xref=GeneID:13540299...gb…
updated 5 months ago • sebabiokr
Hi, I have an annotated FASTA file and I want to add, at the first position after >, the word that follows 'Similar to' >_Anouracaudifer_00017283...2\1\2/' file.txt > new_file_2.txt` and store it in a new file and tried to paste it into the headers but it does not work. Any ideas?
updated 12 months ago • Diana Nadia
Hello, I am creating a custom TE library and need fasta file headers to be in a specific format. If I have file 1 with headers like such: >L2-10_EL__1_000087d4-94a9-4af9-a82b...Hopefully that makes sense? I'm worried that this isn't possible due to the odd format of the fasta file headers. Thank you in advance
updated 3.1 years ago • Ava
Hello! I have a FASTA file and I want to change its headers to a new name. Through searching here on this platform I have found some relevant...but not saved. Also a huge portion of sequences is removed...and I do the first sequences their headers are not named..any idea what could be the problem? I am using Linux Konsole. My input sequences > LTR-12 ATTGGAAAACAAACTATCCTACCTTC…
updated 3.6 years ago • lukhanyomakhabane
Hello I have a .ffn file having 1000 sequences. I wanted to check whether all the fasta headers have sequence underneath them, and it would be great if I also get to know about the fasta headers which do not
updated 22 months ago • utkarsh.sood
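A hedged Python sketch for the check described above: it walks through the lines once and collects every header that is not followed by at least one non-blank sequence line. The function name `headers_without_sequence` is hypothetical, and blank lines are assumed not to count as sequence.

```python
def headers_without_sequence(lines):
    """Return the headers (without '>') that have no sequence underneath them."""
    empty = []
    current = None      # header of the record being read
    has_seq = False     # whether a non-blank sequence line was seen for it
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            if current is not None and not has_seq:
                empty.append(current)
            current = line[1:]
            has_seq = False
        elif line:
            has_seq = True
    # The last record is only closed by the end of the input.
    if current is not None and not has_seq:
        empty.append(current)
    return empty
```

An empty return value means every header in the file has sequence underneath it; otherwise the list names the ones that do not.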
need from a database using the following bioperl code: use strict; use Bio::SearchIO; use Bio::DB::Fasta; my ($file, $id, $start, $end) = ("secondround_merged_expanded.fasta","C7136661:0-107",1,10); my $db = Bio::DB::Fasta->new($file); my $seq = $db->...seq($id, $start, $end); print $seq,"\n"; Where the header of the sequence I'm trying to extract is: C7136661:0-107, as in the …
updated 4.4 years ago • jason.r.gallant
Hi I need help writing a command to remove part of a header from my scaffold fasta file. I have headers that look like >scaffold3247|size3454 TTATATAACTAATTAGATAAAATAGCTAATAATAAAAGCTTCTATATAACTAGCCTTCTTTTAATCTATATAATAAGCTTAGCTAATAAAAAGGCCCACT
updated 18 months ago • kcl58759
Hi, I have thousands (1000's) of FASTA files in one directory. I want to replace all the FASTA file headers with the same keyword **>Gast_superba**. Any suggestions?
updated 2.1 years ago • sunnykevin97
Hi! I have two files: one is a protein FASTA file (`a.fa`) and the other is `header.txt`. I want to get my sequences in the same order as the header file. How can I do this
updated 5 months ago • Nelo
Hi everyone, I'm encountering a problem with FASTA headers that are too long. They get truncated at the 20th position by a program (TargetP) I'm using. Example: ``` >ConsensusfromContig10000...entries named "ConsensusfromContig1". Is there any software or any script I can use to rename the headers in a way that they are 20 characters long and can still be identified? I have only found scri…
updated 16 months ago • branokdrung
Hi there, I have fasta files with header @AS500187:87:J5LBGHRXX:2:11101:7742:1046 1:N:0:21 I want to replace `@AS500187:87:J5LBGHRXX:2` with `@AS500187
updated 22 months ago • bnina9999